1.    INTRODUCTION

Corresponding to any time series $\{X_i;\ i = 0,1,2,\ldots\}$ is the time series $\{\bar{X}_j;\ j = 0,1,2,\ldots\}$, where

$$\bar{X}_j = b^{-1} \sum_{h=1}^{b} X_{(j-1)b+h} \qquad (1)$$

is the $j$th batch mean of size $b$. The batch-means process, or aggregated time series, is
of interest when observations are actually batched (see, e.g., Telser [1967], Amemiya
and Wu [1972], and Tiao [1972]), when calculations need to be simplified (see, e.g.,
Blackman and Tukey [1958, Sec. B.17]), or when the process mean, $E(X)$, is to be
estimated. The third context motivates our work.

Consider estimating $E(X)$ with the average of $n$ observations, $\bar{X} = \sum_{i=1}^{n} X_i / n$. Using
batch means to estimate the variance of the sample mean, $V(\bar{X})$, has long been
considered (see, e.g., Conway [1963]). Brillinger (1973) shows that if the values of a process
at a distance from each other are only weakly dependent, then the batch means are
asymptotically independent and normally distributed. Thus, for large batch size $b$,
$V(\bar{X})$ can be estimated using $S_k^2 / k$, where $k = \lfloor n/b \rfloor$,

$$S_k^2 = (k-1)^{-1} \sum_{j=1}^{k} (\bar{X}_j - \bar{X})^2,$$

and $\lfloor\ \rfloor$ denotes the floor function. (Moran [1965] discusses related estimators.) A nominal
$100(1-\alpha)$ percent confidence interval on $E(X)$ is then $\bar{X} \pm t_{1-\alpha/2,\,k-1}\, S_k\, k^{-1/2}$, where
$t_{1-\alpha/2,\,k-1}$ is the $1-(\alpha/2)$ quantile of Student's t distribution with $k-1$ degrees of freedom.
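The batch-means interval just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `batch_means_ci` is hypothetical, the Student's t quantile is supplied by the caller (for example from a table) rather than computed, and the interval is centered on the average of the complete batches, which equals the sample mean only when $n$ is a multiple of $b$.

```python
import math

def batch_means_ci(x, b, t_quantile):
    """Nominal confidence interval for the process mean from batch means.

    x          : sequence of n observations
    b          : batch size
    t_quantile : the 1 - alpha/2 quantile of Student's t with k - 1 degrees
                 of freedom, supplied by the caller (assumption of this sketch)
    Returns (batch means, interval center, interval half-width).
    """
    k = len(x) // b                       # number of complete batches, floor(n / b)
    if k < 2:
        raise ValueError("need at least two complete batches")
    means = [sum(x[j * b:(j + 1) * b]) / b for j in range(k)]
    center = sum(means) / k               # equals the sample mean when n = k * b
    s2 = sum((m - center) ** 2 for m in means) / (k - 1)   # S_k^2
    half_width = t_quantile * math.sqrt(s2 / k)            # t * S_k * k^(-1/2)
    return means, center, half_width

# Four batches of size five; 3.182 is the tabled 0.975 quantile of t with 3 d.f.
means, center, half_width = batch_means_ci(list(range(1, 21)), 5, 3.182)
print(means, center, half_width)
```

The interval is only nominal: its coverage depends on the batch means being nearly independent and normal, which is the question the rest of the paper addresses.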

The batch means algorithms developed by Mechanic and McKay (1966), Fishman
(1978), Law and Carson (1979), Schriber and Andrews (1979), and Adam (1983)
empirically calculate measures of batch dependency for various batch sizes $b$ in an attempt to
determine a reasonably small value of $b$ that yields batch means that are almost
independent and normally distributed. These procedures require substantial calculation;
Law and Carson (1979), for example, calculate first-order correlations based on 400
batches. That so many batches are required for accurate estimation of dependency
measures is unfortunate, since Schmeiser (1982) shows that, for fixed $n$, additional
batches beyond some small number (ten to thirty) do little to improve the statistical
properties of the batch means confidence interval procedures. The results of Section 2
are motivated by the idea that knowledge of the relationship between $\{X_i\}$ and $\{\bar{X}_j\}$ can
be used to measure properties of $\{\bar{X}_j\}$ even for small values of $k$.

A second reason for studying the relationship between $\{X_i\}$ and $\{\bar{X}_j\}$ is to allow
more efficient simulation studies of batch-means procedures. Studying the performance
of several batch-means procedures in the context of various distributional assumptions
for $\{X_i\}$ requires a large computational effort, especially when the large sample sizes
required to simulate a system and the large number of replications required for
meaningful conclusions are considered. A crude Monte Carlo method is to generate
$X_1, X_2, \ldots, X_n$ and calculate the batch means $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_{\lfloor n/b \rfloor}$ for all values of $b$ of
interest. A computationally more efficient alternative is to derive the properties of $\{\bar{X}_j\}$
from the properties of $\{X_i\}$ and to generate directly the batch means $\{\bar{X}_j\}$, as discussed
in Section 4.


A third motivation is that direct insights might result from studying properties of
$\{\bar{X}_j\}$ as functions of the parameters of $\{X_i\}$. Particularly interesting is the sensitivity of
$\{\bar{X}_j\}$ to the underlying process and to the batch size $b$, as discussed in Kang (1984,
Chapter 5).

Section  2  contains  results  relating  batch-means  processes  to  arbitrary  stationary 
autoregressive  moving-average  (ARMA)  processes.  Section  3  considers  the  special  case 
of  the  underlying  process  being  ARMA  (1,1).  Section  4  is  a  summary  containing  an 
algorithm  for  determining  the  batch-means  process  from  the  underlying  process. 


2.    BATCH  MEANS  OF  STATIONARY  ARMA  PROCESSES 

The ARMA$(p,q)$ process $\{X_i\}$ by definition satisfies

$$\sum_{h=0}^{p} \phi_h X_{i-h} = \sum_{h=0}^{q} \theta_h \epsilon_{i-h} \qquad (2)$$

where $\phi_0 = 1$, $\theta_0 = 1$, and the error terms $\epsilon_i$ are independent with zero mean and
variance $\sigma_\epsilon^2$. The main result of this paper is Theorem 1, which states that batch-means
processes arising from ARMA underlying processes are themselves ARMA and specifies
the parameter values.

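Theorem 1. Consider the stationary ARMA$(p,q)$ process of equation (2).
Then $\{\bar{X}_j\}$ is the stationary ARMA$(p,\bar{q})$ process

$$\sum_{h=0}^{p} \bar\phi_h \bar{X}_{j-h} = \sum_{h=0}^{\bar{q}} \bar\theta_h \bar\epsilon_{j-h}$$

where $\bar\phi_0 = 1$, $\bar\theta_0 = 1$, the batch-means error terms $\bar\epsilon_j$ are independent random
variables with zero mean and variance $\sigma_{\bar\epsilon}^2$, and $\bar{q}$, $\bar\phi_1, \bar\phi_2, \ldots, \bar\phi_p$, and
$\bar\theta_1, \bar\theta_2, \ldots, \bar\theta_{\bar{q}}$ are functions (of the parameters of the underlying process and
the batch size $b$) given in Lemmas 1, 2, and 4, respectively.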

The  proof  of  Theorem  1  requires  the  following  lemmas. 

Lemma 1 (Anderson [1979a, p. 155]). If the underlying process $\{X_i\}$ is a
stationary ARMA$(p,q)$ process, then the batch-means process $\{\bar{X}_j\}$ is a
stationary ARMA$(p,\bar{q})$ process, where $\bar{q} = p - \lfloor (p-q)/b \rfloor$.

Anderson uses the more complicated, but equivalent, expression $\bar{q} = \lfloor [q + (p+1)(b-1)]/b \rfloor$.
Lemma 1 has several direct implications, as discussed in the Appendix.
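As a quick illustration of Lemma 1, the order formula can be evaluated directly. The sketch below is not from the paper: the function names are hypothetical, and the second function is the single-floor form $\lfloor [q + (p+1)(b-1)]/b \rfloor$, included only to check numerically that it agrees with $p - \lfloor (p-q)/b \rfloor$.

```python
def batch_ma_order(p, q, b):
    """MA order of the batch-means process: p - floor((p - q) / b)."""
    if b < 1:
        raise ValueError("batch size must be a positive integer")
    # Python's // floors toward minus infinity, so this is also right for p < q.
    return p - (p - q) // b

def batch_ma_order_alt(p, q, b):
    """Algebraically equivalent single-floor form of the same order."""
    return (q + (p + 1) * (b - 1)) // b

for p, q, b in [(1, 1, 10), (2, 0, 2), (0, 3, 2), (3, 0, 2)]:
    print((p, q, b), batch_ma_order(p, q, b), batch_ma_order_alt(p, q, b))
```

The two forms agree because $-\lfloor d/b \rfloor = \lfloor (b-1-d)/b \rfloor$ for any integer $d$ and positive integer $b$.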

Lemma 2 (Amemiya and Wu [1972]). The AR parameters $\bar\phi_1, \bar\phi_2, \ldots, \bar\phi_p$
of the batch means process $\{\bar{X}_j\}$ are the coefficients of $B^1, B^2, \ldots, B^p$ of
$\prod_{h=1}^{p} (1 - a_h^b B)$, respectively, where $a_1, a_2, \ldots, a_p$ are the roots of the
characteristic equation $\Phi(B) = \sum_{h=0}^{p} \phi_h B^{p-h} = 0$.
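Lemma 2 can be evaluated without explicit root finding. The sketch below swaps in a different but equivalent route, and this substitution is my own: the roots $a_h$ are the eigenvalues of the companion matrix of $\Phi$, so the $a_h^b$ are the eigenvalues of its $b$th power, and the required coefficients are those of the characteristic polynomial of that power, obtained here with the Faddeev-LeVerrier recursion. All function names are hypothetical.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    """Coefficients [1, c1, ..., cn] of det(xI - A) = x^n + c1 x^(n-1) + ... + cn,
    computed with the Faddeev-LeVerrier recursion."""
    n = len(A)
    coeffs = [1]
    M = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k
        coeffs.append(c)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return coeffs

def batch_ar_params(phi, b):
    """phi = [phi_1, ..., phi_p] (phi_0 = 1 implied); returns [phibar_1, ..., phibar_p]."""
    p = len(phi)
    if p == 0:
        return []
    # Companion matrix of x^p + phi_1 x^(p-1) + ... + phi_p; its eigenvalues are a_h.
    C = [[-phi[j] if i == 0 else (1 if j == i - 1 else 0) for j in range(p)]
         for i in range(p)]
    P = C
    for _ in range(b - 1):
        P = mat_mul(P, C)                 # eigenvalues of C^b are a_h^b
    # prod_h (1 - a_h^b B) and prod_h (x - a_h^b) share the same coefficient list.
    return char_poly(P)[1:]

# Roots 1/2 and 1/3, batch size 2: prod (1 - a^2 B) = 1 - (13/36) B + (1/36) B^2.
print(batch_ar_params([Fraction(-5, 6), Fraction(1, 6)], 2))
```

Using `Fraction` inputs keeps the arithmetic exact; floats work too, at the cost of rounding error that grows with $b$.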


Lemma 3. For any stationary process $\{X_i\}$, the lag-$h$ autocorrelation of
the batch-means process $\{\bar{X}_j\}$ is

$$\bar\rho_h = \mathrm{Corr}(\bar{X}_j, \bar{X}_{j+h}) = \left[ \sum_{i=1}^{b} i\,\rho_{(h-1)b+i} + \sum_{i=1}^{b-1} i\,\rho_{(h-1)b+2b-i} \right] \Big/ (bc),$$

where $c = 1 + 2\sum_{h=1}^{b-1} (1-(h/b))\,\rho_h$, $\rho_h = R_h / R_0$, and
$R_h = E[(X_i - E X)(X_{i+h} - E X)]$ for $h = 0,1,2,\ldots$ and $i = 0,1,2,\ldots$.

Proof. For any stationary process, the lag-$h$ covariance of the batch
means process is

$$\mathrm{Cov}(\bar{X}_j, \bar{X}_{j+h}) = b^{-2} \sum_{i=1}^{b} \sum_{k=1}^{b} \mathrm{Cov}(X_{(j-1)b+i}, X_{(j+h-1)b+k}) \qquad (3)$$

(see, e.g., Kleijnen [1975, p. 507]), which is a special case of the covariance
of linear combinations of random variables as discussed in Box and Jenkins
(1976, pp. 28-29). Also, for any stationary process, each batch mean $\bar{X}_j$
has variance

$$\bar{R}_0 = V(\bar{X}_j) = c R_0 / b \qquad (4)$$

(see, e.g., Fishman [1973, p. 281]). The definition of correlation

$$\bar\rho_h = \mathrm{Cov}(\bar{X}_j, \bar{X}_{j+h}) / [V(\bar{X}_j) V(\bar{X}_{j+h})]^{1/2}$$

and equations (3) and (4) yield

$$\bar\rho_h = b^{-2} \sum_{i=1}^{b} \sum_{k=1}^{b} \mathrm{Cov}[X_{(j-1)b+i}, X_{(j+h-1)b+k}] \Big/ [c R_0 / b]
= \sum_{i=1}^{b} \sum_{k=1}^{b} \mathrm{Corr}[X_{(j-1)b+i}, X_{(j+h-1)b+k}] \Big/ [bc].$$

Counting like terms arising from stationarity yields the result. □
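The counting step at the end of the proof can be checked numerically. The following sketch (hypothetical function names; the geometric autocorrelation sequence in the usage line is just a convenient test input) evaluates Lemma 3 directly and compares it with the full $b \times b$ double sum of equation (3).

```python
def batch_autocorr(rho, b, h):
    """Lag-h autocorrelation of batch means of size b, as stated in Lemma 3.

    rho : list with rho[0] = 1 and rho[m] the lag-m autocorrelation of the
          underlying process, for m up to (h + 1) * b - 1; requires h >= 1.
    """
    c = 1 + 2 * sum((1 - m / b) * rho[m] for m in range(1, b))
    rising = sum(i * rho[(h - 1) * b + i] for i in range(1, b + 1))
    falling = sum(i * rho[(h - 1) * b + 2 * b - i] for i in range(1, b))
    return (rising + falling) / (b * c)

def batch_autocorr_brute(rho, b, h):
    """Same quantity from the double sum of equation (3), for checking only."""
    cov = sum(rho[abs(h * b + k - i)]
              for i in range(1, b + 1) for k in range(1, b + 1))
    var = sum(rho[abs(k - i)]
              for i in range(1, b + 1) for k in range(1, b + 1))
    return cov / var

rho = [0.7 ** m for m in range(40)]
print(batch_autocorr(rho, 5, 1), batch_autocorr_brute(rho, 5, 1))
```

The single sums cost $O(b)$ per lag instead of $O(b^2)$, which matters when many batch sizes are examined.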

Lemma 4 (Anderson [1971, p. 237]). Consider a stationary ARMA$(p,\bar{q})$
process with known AR (autoregressive) parameters $\bar\phi_1, \bar\phi_2, \ldots, \bar\phi_p$, variance
$\bar{R}_0$, and autocorrelation coefficients $\bar\rho_1, \bar\rho_2, \ldots, \bar\rho_{\bar{q}}$. Then the MA
(moving-average) parameters $\bar\theta_1, \bar\theta_2, \ldots, \bar\theta_{\bar{q}}$ are determined.

Theorem  1  can  now  be  proven  using  the  four  lemmas. 

Proof  of  Theorem  1. 

1. From Lemma 1, the AR and MA orders of $\{\bar{X}_j\}$ are determined; in
particular, $\{\bar{X}_j\}$ is an ARMA$(p,\bar{q})$ process.

2. From Lemma 2, the AR parameters of $\{\bar{X}_j\}$ are determined.

3. The autocorrelations $\rho_1, \rho_2, \ldots, \rho_{b(\bar{q}+1)-1}$ of a stationary ARMA process
can be calculated using the algorithm of Sweet and Mazaheri (1979).

4. Given the autocorrelations from Step 3, the batch-means variance $\bar{R}_0$
is determined by equation (4) and the batch-means autocorrelations
$\bar\rho_1, \bar\rho_2, \ldots, \bar\rho_{\bar{q}}$ are determined by Lemma 3.

5. The MA parameters $\bar\theta_1, \bar\theta_2, \ldots, \bar\theta_{\bar{q}}$ are then determined from Lemma 4. □

The representation of the batch-means process is not unique. The batch-means
MA parameters of Lemma 4 require the solution of a polynomial equation of order $2\bar{q}$ to
determine $\bar\theta_1, \bar\theta_2, \ldots, \bar\theta_{\bar{q}}$. The $2\bar{q}$ roots can be partitioned $2^{\bar{q}}$ ways into two sets
$(x_1, x_2, \ldots, x_{\bar{q}})$ and $(x_{\bar{q}+1}, x_{\bar{q}+2}, \ldots, x_{2\bar{q}})$ such that $x_{\bar{q}+i} = 1/x_i$. All such subsets of size $\bar{q}$
from $(x_1, x_2, \ldots, x_{2\bar{q}})$ determine $\bar\theta_1, \bar\theta_2, \ldots, \bar\theta_{\bar{q}}$ corresponding to stochastically equivalent
processes. But there is a unique subset of size $\bar{q}$ having $|x_i| \ge 1$, and therefore
$|x_{\bar{q}+i}| = |1/x_i| \le 1$, for $i = 1,2,\ldots,\bar{q}$. Thus for the ARMA$(p,\bar{q})$ process there exist
$2^{\bar{q}} - 1$ non-invertible processes corresponding to a unique invertible process.
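The simplest instance of this non-uniqueness has a single MA parameter, where the two root choices give exactly one non-invertible alternative. The sketch below is an illustration of my own, not taken from the paper. It uses the standard first-order moving-average moments, $R_0 = \sigma^2(1+\theta^2)$ and $\rho_1 = \theta/(1+\theta^2)$, to show that $\theta$ with error variance $\sigma^2$ and $1/\theta$ with error variance $\sigma^2\theta^2$ have the same second-order properties.

```python
def ma1_moments(theta, sigma2):
    """Variance and lag-one autocorrelation of X_i = e_i + theta * e_{i-1}."""
    r0 = sigma2 * (1 + theta ** 2)
    rho1 = theta / (1 + theta ** 2)
    return r0, rho1

theta, sigma2 = 0.4, 1.0
invertible = ma1_moments(theta, sigma2)
# Reciprocal parameter, with the error variance rescaled to keep R_0 fixed.
non_invertible = ma1_moments(1 / theta, sigma2 * theta ** 2)
print(invertible, non_invertible)
```

Because the error variance must be rescaled along with the parameter, the error variance reported for a batch-means process depends on which root set is chosen.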


3.    ARMA(1,1)  BATCH  MEANS 

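Now consider the special case of stationary first-order ARMA processes

$$X_i + \phi X_{i-1} = \epsilon_i + \theta \epsilon_{i-1} \quad \text{for } i = 1,2,\ldots \qquad (5)$$

The low-order moments of $\{X_i\}$ are the zero mean, variance

$$R_0 = \sigma_\epsilon^2 (1 + \theta^2 - 2\phi\theta) / (1 - \phi^2), \qquad (6)$$

the lag-one autocorrelation

$$\rho_1 = (1 - \phi\theta)(\theta - \phi) / (1 + \theta^2 - 2\phi\theta), \qquad (7)$$

and lag-$h$ autocorrelations

$$\rho_h = (-\phi)\rho_{h-1} = (-\phi)^{h-1}\rho_1 \quad \text{for } h = 2,3,\ldots \qquad (8)$$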
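For reference, the moments of the first-order process can be coded directly. This is a small sketch with hypothetical function names, using the sign convention $X_i + \phi X_{i-1} = \epsilon_i + \theta\epsilon_{i-1}$ (so a positive $\phi$ corresponds to negative first-order autoregression) and the expressions $R_0 = \sigma_\epsilon^2(1+\theta^2-2\phi\theta)/(1-\phi^2)$, $\rho_1 = (1-\phi\theta)(\theta-\phi)/(1+\theta^2-2\phi\theta)$, and $\rho_h = (-\phi)^{h-1}\rho_1$.

```python
def arma11_moments(phi, theta, sigma2):
    """R_0 and rho_1 for X_i + phi * X_{i-1} = e_i + theta * e_{i-1}."""
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")
    d = 1 + theta ** 2 - 2 * phi * theta
    r0 = sigma2 * d / (1 - phi ** 2)
    rho1 = (1 - phi * theta) * (theta - phi) / d
    return r0, rho1

def arma11_rho(phi, theta, h):
    """Lag-h autocorrelation; geometric decay at rate -phi beyond lag one."""
    if h == 0:
        return 1.0
    _, rho1 = arma11_moments(phi, theta, 1.0)
    return (-phi) ** (h - 1) * rho1

print(arma11_moments(0.5, 0.3, 1.0), arma11_rho(0.5, 0.3, 3))
```

These values can be cross-checked against the moving-average weights $\psi_0 = 1$, $\psi_k = (\theta-\phi)(-\phi)^{k-1}$, for which $R_0 = \sigma_\epsilon^2 \sum_k \psi_k^2$.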

Closed-form  expressions  for  the  parameters  of  the  batch-means  process  are  given  in 
Theorem  2. 

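Theorem 2. Consider the stationary ARMA(1,1) process of equation (5).
The corresponding batch-means process is the stationary ARMA(1,1) process

$$\bar{X}_j + \bar\phi \bar{X}_{j-1} = \bar\epsilon_j + \bar\theta \bar\epsilon_{j-1} \quad \text{for } j = 1,2,\ldots \qquad (9)$$

where

$$\bar\phi = (-1)^{b+1} \phi^b, \qquad (10)$$

$$\bar\theta = \frac{(2\bar\rho_1\bar\phi + \bar\phi^2 + 1) \pm [(1-\bar\phi^2)(1-(2\bar\rho_1+\bar\phi)^2)]^{1/2}}{2(\bar\rho_1 + \bar\phi)}, \qquad (11)$$

and

$$\sigma_{\bar\epsilon}^2 = \bar{R}_0 (1-\bar\phi^2) / (\bar\theta^2 - 2\bar\phi\bar\theta + 1), \qquad (12)$$

where

$$\bar{R}_0 = c R_0 / b, \qquad (13)$$

$$\bar\rho_1 = \rho_1 (1+\bar\phi)^2 / [bc(1+\phi)^2], \qquad (14)$$

and

$$c = 1 + \{2\rho_1[(-\phi)^b + b(1+\phi) - 1] / [b(1+\phi)^2]\}. \qquad (15)$$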

Proof of Theorem 2.

Equations (13), (14), and (15) follow from Lemma 3 and equation (4)
via equation (8). Since $\{X_i\}$ is ARMA(1,1), $\{\bar{X}_j\}$ is also ARMA(1,1) by
Lemma 1; that is, equation (9) holds. Equation (10) is a special case of
Lemma 2. Since $\{\bar{X}_j\}$ is ARMA(1,1), equation (7) yields

$$\bar\rho_1 = (1-\bar\phi\bar\theta)(\bar\theta-\bar\phi) / (1+\bar\theta^2-2\bar\phi\bar\theta).$$

Solving for $\bar\theta$ yields the two roots of equation (11). Since $\{\bar{X}_j\}$ is an
ARMA(1,1) process, equation (6) holds with the batch-means parameters:

$$\bar{R}_0 = \sigma_{\bar\epsilon}^2 (1+\bar\theta^2-2\bar\phi\bar\theta) / (1-\bar\phi^2).$$

Solving for the variance of the batch-means error term, $\sigma_{\bar\epsilon}^2$, with either
value of $\bar\theta$ from equation (11) yields equation (12). (But note that the
value of $\sigma_{\bar\epsilon}^2$ depends on the choice of $\bar\theta$.) □
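The step "solving for the batch-means MA parameter" can be spelled out. The following worked derivation is a sketch that starts only from the lag-one autocorrelation expression written in the batch-means parameters. Bars denote batch-means quantities.

```latex
\begin{align*}
\bar\rho_1\,(1+\bar\theta^{2}-2\bar\phi\bar\theta)
  &= (1-\bar\phi\bar\theta)(\bar\theta-\bar\phi)
   = \bar\theta-\bar\phi-\bar\phi\bar\theta^{2}+\bar\phi^{2}\bar\theta \\[4pt]
(\bar\rho_1+\bar\phi)\,\bar\theta^{2}
  -(2\bar\rho_1\bar\phi+\bar\phi^{2}+1)\,\bar\theta
  +(\bar\rho_1+\bar\phi) &= 0 \\[4pt]
(2\bar\rho_1\bar\phi+\bar\phi^{2}+1)^{2}-4(\bar\rho_1+\bar\phi)^{2}
  &= \bigl[(1-\bar\phi)^{2}-2\bar\rho_1(1-\bar\phi)\bigr]
     \bigl[(1+\bar\phi)^{2}+2\bar\rho_1(1+\bar\phi)\bigr] \\
  &= (1-\bar\phi^{2})\bigl(1-(2\bar\rho_1+\bar\phi)^{2}\bigr)
\end{align*}
```

The quadratic formula applied to the second line, with the discriminant in the last line, gives the two roots. Because the leading and constant coefficients of the quadratic are equal, the product of the roots is one: the two roots are reciprocals, which is the invertible and non-invertible pairing described at the end of Section 2.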

4.    SUMMARY 

Properties of batch means are studied under the assumption that the underlying
process is ARMA$(p,q)$. For ARMA(1,1) processes, closed-form expressions for the
corresponding batch-means processes are obtained. A numerical procedure is developed
for calculating the parameters of the ARMA batch-means process from the parameters
of the underlying process and the batch size $b$. This procedure is stated concisely here
for convenience.

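ARMA(p,q) Procedure.

Given parameters $\phi_1, \phi_2, \ldots, \phi_p$, $\theta_1, \theta_2, \ldots, \theta_q$, error variance $\sigma_\epsilon^2$, and batch
size $b$, calculate

1.  $\bar{q} = p - \lfloor (p-q)/b \rfloor$

2.  $\bar\phi_1, \bar\phi_2, \ldots, \bar\phi_p$ using Lemma 2

3.  $R_0, \rho_1, \rho_2, \ldots, \rho_{b(\bar{q}+1)-1}$ from the Yule-Walker equations, probably using
the algorithm of Sweet and Mazaheri (1979)

4.  $c = 1 + 2\sum_{h=1}^{b-1} (1-(h/b))\,\rho_h$

5.  $\bar{R}_0$ from equation (4)

6.  $\bar\rho_1, \bar\rho_2, \ldots, \bar\rho_{\bar{q}}$ using Lemma 3

7.  $\bar\theta_1, \bar\theta_2, \ldots, \bar\theta_{\bar{q}}$ from Lemma 4

8.  $\sigma_{\bar\epsilon}^2$ from equation (13) of Anderson (1971, p. 237).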

A FORTRAN implementation of the ARMA$(p,q)$ procedure is given in Kang (1984).
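Step 3 of the general procedure needs the variance and autocorrelations of the underlying process. The sketch below is not the Sweet and Mazaheri algorithm and not the FORTRAN code of Kang (1984). As a stand-in that is easy to verify, it approximates the autocovariances from truncated moving-average weights $\psi_k$, where $X_i = \sum_k \psi_k \epsilon_{i-k}$. The function name and the truncation length `n_weights` are assumptions of this illustration.

```python
def arma_autocorr(phi, theta, max_lag, n_weights=2000):
    """Approximate (R_0 / sigma^2, [rho_1, ..., rho_max_lag]) for the process
    sum_h phi_h X_{i-h} = sum_h theta_h e_{i-h}, with phi_0 = theta_0 = 1.

    phi, theta : lists [phi_1, ..., phi_p] and [theta_1, ..., theta_q]
    n_weights  : number of psi weights kept in each sum (truncation choice)
    """
    p, q = len(phi), len(theta)
    psi = [1.0]
    for k in range(1, n_weights + max_lag):
        t = theta[k - 1] if k <= q else 0.0
        ar = sum(phi[j - 1] * psi[k - j] for j in range(1, min(k, p) + 1))
        psi.append(t - ar)                # psi_k = theta_k - sum_j phi_j psi_{k-j}
    gamma = [sum(psi[k] * psi[k + h] for k in range(n_weights))
             for h in range(max_lag + 1)]
    return gamma[0], [g / gamma[0] for g in gamma[1:]]

r0_scaled, rho = arma_autocorr([0.5], [0.3], 3)
print(r0_scaled, rho)
```

The truncation is adequate when the weights decay quickly; processes close to nonstationarity need more weights or an exact method.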

When  the  underlying  process  is  ARMA(1,1),  the  following  closed-form  procedure 
can  be  used: 

ARMA(1,1) Procedure.

Given parameters $\phi$, $\theta$, error variance $\sigma_\epsilon^2$, and batch size $b$, calculate

1.  $\bar{q} = 1$

2.  $\bar\phi$ from equation (10)

3.  $R_0$ from equation (6), $\rho_1$ from equation (7)

4.  $c$ from equation (15)

5.  $\bar{R}_0$ from equation (13)

6.  $\bar\rho_1$ from equation (14)

7.  $\bar\theta$ from equation (11)

8.  $\sigma_{\bar\epsilon}^2$ from equation (12)

Notice in the ARMA(1,1) special case that calculation in step 3 of all $2b-1$
autocorrelations of the underlying process is not necessary.
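The eight steps above can be sketched as a single function. This is an illustration under assumptions, not the authors' implementation: the function name is hypothetical, the invertible root is selected, and the root is written in the rationalized form $2(\bar\rho_1+\bar\phi)/(A + \sqrt{D})$, with $A = 2\bar\rho_1\bar\phi + \bar\phi^2 + 1$ and $D = (1-\bar\phi^2)(1-(2\bar\rho_1+\bar\phi)^2)$. That form is algebraically the minus-sign root of equation (11) but avoids division by zero when $\bar\rho_1 + \bar\phi = 0$.

```python
import math

def arma11_batch_means(phi, theta, sigma2, b):
    """Return (phibar, thetabar, sigma2bar) for batch means of size b of
    X_i + phi * X_{i-1} = e_i + theta * e_{i-1} with error variance sigma2."""
    phibar = (-1) ** (b + 1) * phi ** b                            # step 2, eq. (10)
    d = 1 + theta ** 2 - 2 * phi * theta
    r0 = sigma2 * d / (1 - phi ** 2)                               # step 3, eq. (6)
    rho1 = (1 - phi * theta) * (theta - phi) / d                   # step 3, eq. (7)
    c = 1 + 2 * rho1 * ((-phi) ** b + b * (1 + phi) - 1) / (b * (1 + phi) ** 2)  # (15)
    r0bar = c * r0 / b                                             # step 5, eq. (13)
    rho1bar = rho1 * (1 + phibar) ** 2 / (b * c * (1 + phi) ** 2)  # step 6, eq. (14)
    a = 2 * rho1bar * phibar + phibar ** 2 + 1
    disc = max((1 - phibar ** 2) * (1 - (2 * rho1bar + phibar) ** 2), 0.0)
    # Step 7: invertible root of eq. (11), rationalized to stay finite and stable.
    thetabar = 2 * (rho1bar + phibar) / (a + math.sqrt(disc))
    sigma2bar = r0bar * (1 - phibar ** 2) / (thetabar ** 2 - 2 * phibar * thetabar + 1)  # (12)
    return phibar, thetabar, sigma2bar

print(arma11_batch_means(0.5, 0.3, 2.0, 4))
```

With $b = 1$ and $|\theta| < 1$ the function returns the original parameters, which is a convenient sanity check.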

If the underlying error terms $\epsilon_i$ are normally distributed, then the batch-means
error terms $\bar\epsilon_j$ are also normally distributed (see, e.g., Johnson and Kotz [1971, p. 51]).
Therefore, generation of random variates directly from the batch-means process is
straightforward using equation (9), thereby avoiding the costly computations of
aggregating observations from the underlying process. Initialization for steady-state results is
straightforward for AR and MA processes, but initialization for ARMA is complicated
unless the process is warmed up by discarding some initial observations (Anderson
[1979b]).
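A minimal sketch of direct generation follows, assuming normal errors and the first-order batch-means recursion $\bar{X}_j + \bar\phi\bar{X}_{j-1} = \bar\epsilon_j + \bar\theta\bar\epsilon_{j-1}$ with zero process mean. The function name, the arbitrary zero starting value, and the default warm-up length are assumptions of this illustration; the source does not prescribe a warm-up length.

```python
import math
import random

def generate_batch_means(phibar, thetabar, sigma2bar, k, warmup=100, seed=None):
    """Generate k batch means directly, discarding `warmup` initial values.

    Returns (means, errors): means has k entries; errors has k + 1 entries,
    where errors[0] is the error term that precedes means[0].
    """
    rng = random.Random(seed)
    sd = math.sqrt(sigma2bar)
    x = [0.0]                             # arbitrary start, removed by the warm-up
    e = [rng.gauss(0.0, sd)]
    for _ in range(warmup + k):
        e.append(rng.gauss(0.0, sd))
        # Rearranged recursion: x_j = -phibar * x_{j-1} + e_j + thetabar * e_{j-1}
        x.append(-phibar * x[-1] + e[-1] + thetabar * e[-2])
    return x[-k:], e[-(k + 1):]

means, errors = generate_batch_means(-0.0625, -0.09, 1.5, k=10, seed=1)
print(means)
```

Each batch mean costs one normal variate here, compared with $b$ variates and $b$ additions when the underlying observations are generated and aggregated. A nonzero process mean can be added to every generated value afterward.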
APPENDIX

Although  Lemma  1   is  simple  to  state  compactly,  its  implications  are  more  clear 
when  cases  are  considered  individually: 


If            and              then

$p > q$       $b < p-q$        $q \le \bar{q} \le p-1$
              $b = p-q$        $\bar{q} = p-1$
              $b > p-q$        $\bar{q} = p$

$p = q$                        $\bar{q} = p$

$p < q$       $b < q-p$        $p+2 \le \bar{q} \le q$
              $b = q-p$        $\bar{q} = p+1$
              $b > q-p$        $\bar{q} = p+1$

Many results can be stated immediately from examination of these individual
cases. Five such results are:

Result 1. If $\{X_i\}$ is AR$(p)$, then $\{\bar{X}_j\}$ is ARMA$(p,\bar{q})$, as studied by
Amemiya and Wu (1972). Additionally, $1 \le \bar{q} \le p$.

Result 2. If $\{X_i\}$ is MA$(q)$, then $\{\bar{X}_j\}$ is MA$(\bar{q})$, where $1 \le \bar{q} \le q$.

Result 3. If $\{X_i\}$ is AR or ARMA with batch size satisfying $0 \le p-q < b$,
then $\{\bar{X}_j\}$ is ARMA$(p,p)$.

Result 4. If $p \ge q$, then $\lim_{b \to \infty} \bar{q} = p$.

Result 5. If $p < q$, then $\lim_{b \to \infty} \bar{q} = p+1$.
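The five results can be confirmed by brute force over small orders and batch sizes. The sketch below is only a check of the order formula $\bar{q} = p - \lfloor (p-q)/b \rfloor$, with hypothetical function names. It assumes $b \ge 2$ for Results 1 and 2, since with $b = 1$ the batch means coincide with the original series, and it reads the limits in Results 4 and 5 as "for every $b$ larger than $|p-q|$".

```python
def batch_ma_order(p, q, b):
    """MA order of the batch-means process: p - floor((p - q) / b)."""
    return p - (p - q) // b

def check_results(max_order=7, max_batch=30):
    """Return True if Results 1-5 hold for all orders and batch sizes tried."""
    for b in range(2, max_batch + 1):
        for p in range(1, max_order + 1):          # Result 1: AR(p), q = 0
            assert 1 <= batch_ma_order(p, 0, b) <= p
        for q in range(1, max_order + 1):          # Result 2: MA(q), p = 0
            assert 1 <= batch_ma_order(0, q, b) <= q
    for p in range(0, max_order + 1):
        for q in range(0, max_order + 1):
            for b in range(1, max_batch + 1):
                if 0 <= p - q < b:                 # Result 3
                    assert batch_ma_order(p, q, b) == p
                if b > abs(p - q):                 # Results 4 and 5
                    assert batch_ma_order(p, q, b) == (p if p >= q else p + 1)
    return True

print(check_results())
```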

Of course, considering only the order of the batch-means process can be
misleading. For example, Results 4 and 5 indicate that large batches lead to MA components
of order $p$ or $p+1$; in particular, a batched MA$(q)$ process converges to an MA(1)
process. But large batches are asymptotically independent. The explanation is that $\bar\theta_1$ is
approaching zero as batch size increases. An implication is that, even for this nicest
case of ARMA underlying processes, estimation of the order of the batch-means process
is likely to be difficult.